118 research outputs found

    Starr: Simple Tiling ARRay analysis of Affymetrix ChIP-chip data

    Background: Chromatin immunoprecipitation combined with DNA microarrays (ChIP-chip) is an assay used for investigating DNA-protein binding or post-translational chromatin/histone modifications. As with all high-throughput technologies, it requires thorough bioinformatic processing of the data, for which there is no standard yet. The primary goal is to reliably identify and localize genomic regions that bind a specific protein. Further investigation compares binding profiles of functionally related proteins, or binding profiles of the same proteins in different genetic backgrounds or experimental conditions. Ultimately, the goal is to gain a mechanistic understanding of the effects of DNA binding events on gene expression. Results: We present a free, open-source R/Bioconductor package, Starr, that facilitates comparative analysis of ChIP-chip data across experiments and across different microarray platforms. The package provides functions for data import, quality assessment, data visualization and exploration. Starr includes high-level analysis tools such as the alignment of ChIP signals along annotated features, correlation analysis of ChIP signals with complementary genomic data, peak-finding and comparative display of multiple clusters of binding profiles. It uses standard Bioconductor classes for maximum compatibility with other software. Moreover, Starr automatically updates microarray probe annotation files by a highly efficient remapping of microarray probe sequences to an arbitrary genome. Conclusion: Starr is an R package that covers the complete ChIP-chip workflow from data processing to binding pattern detection. It focuses on high-level data analysis, e.g., it provides methods for the integration and combined statistical analysis of binding profiles and complementary functional genomics data. Starr enables systematic assessment of binding behaviour for groups of genes that are aligned along arbitrary genomic features.

    A simple and robust method for partially matched samples using the p-values pooling approach

    This paper focuses on statistical analyses in scenarios where some samples from the matched pairs design are missing, resulting in partially matched samples. Motivated by the idea of meta-analysis, we recast the partially matched samples as coming from two experimental designs, and propose a simple yet robust approach based on the weighted Z-test to integrate the p-values computed from these two designs. We show that the proposed approach achieves better operating characteristics in simulations and a case study, compared to existing methods for partially matched samples.
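
The weighted Z-test mentioned in this abstract can be sketched as follows. This is a minimal illustration of the generic weighted Z-test (Stouffer's method) for pooling one-sided p-values, not the authors' implementation: the function name `weighted_z_pvalue` is hypothetical, and the abstract does not say how the weights are chosen or which tests produce the two p-values, so the weights are left to the caller.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def weighted_z_pvalue(p_values, weights):
    """Pool one-sided p-values with the weighted Z-test (Stouffer's method).

    Hypothetical helper for illustration. Each p-value must lie strictly
    between 0 and 1; the weights are an assumption supplied by the caller.
    """
    if not p_values or len(p_values) != len(weights):
        raise ValueError("need one weight per p-value")
    # Map each one-sided p-value to a z-score: z_i = Phi^{-1}(1 - p_i).
    z_scores = [_STD_NORMAL.inv_cdf(1.0 - p) for p in p_values]
    # Combined statistic: sum(w_i * z_i) / sqrt(sum(w_i^2)).
    combined = sum(w * z for w, z in zip(weights, z_scores))
    combined /= sqrt(sum(w * w for w in weights))
    # Convert the combined z-score back to a one-sided p-value.
    return 1.0 - _STD_NORMAL.cdf(combined)
```

With equal weights, two moderately small p-values reinforce each other, so the pooled p-value is smaller than either input; with a single p-value the function returns it unchanged.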

    Integrating Prior Knowledge in Multiple Testing under Dependence with Applications to Detecting Differential DNA Methylation

    DNA methylation has emerged as an important hallmark of epigenetics. Numerous platforms, including tiling arrays and next-generation sequencing, and experimental protocols are available for profiling DNA methylation. Similar to other tiling array data, DNA methylation data share an inherent correlation structure among nearby probes. However, unlike in gene expression or protein-DNA binding data, the varying CpG density, which gives rise to the CpG island, shore and shelf definitions, provides exogenous information for detecting differential methylation. This paper aims to introduce a robust testing and probe ranking procedure based on a non-homogeneous hidden Markov model that incorporates the above-mentioned features for detecting differential methylation. We revisit the seminal work of Sun and Cai (2009, J. R. Stat. Soc. B. 71, 393-424) and propose modeling the non-null using a non-parametric symmetric distribution in two-sided hypothesis testing. We show that this model improves probe ranking and is robust to model misspecification based on extensive simulation studies. We further illustrate that our proposed framework achieves good operating characteristics compared to commonly used methods on real DNA methylation data, where the aim is to detect differential methylation sites.

    A Statistical Framework for the Analysis of ChIP-Seq Data

    Chromatin immunoprecipitation followed by sequencing (ChIP-Seq) has revolutionized experiments for genome-wide profiling of DNA-binding proteins, histone modifications, and nucleosome occupancy. As the cost of sequencing is decreasing, many researchers are switching from microarray-based technologies (ChIP-chip) to ChIP-Seq for genome-wide study of transcriptional regulation. Despite its increasing and well-deserved popularity, there is little work that investigates and accounts for sources of bias in the ChIP-Seq technology. These biases typically arise from both the standard pre-processing protocol and the underlying DNA sequence of the generated data.

    A systematic assessment of normalization approaches for the Infinium 450K methylation platform

    The Illumina Infinium HumanMethylation450 BeadChip has emerged as one of the most popular platforms for genome-wide profiling of DNA methylation. While the technology is widespread, systematic technical biases are believed to be present in the data. For example, this array incorporates two different chemical assays, i.e., Type I and Type II probes, which exhibit different technical characteristics and potentially complicate the computational and statistical analysis. Several normalization methods have been introduced recently to adjust for possible biases. However, there is considerable debate within the field on which normalization procedure should be used and indeed whether normalization is even necessary. Yet despite the importance of the question, there has been little comprehensive comparison of normalization methods. We sought to systematically compare several popular normalization approaches using the Norwegian Mother and Child Cohort Study (MoBa) methylation data set and the technical replicates analyzed with it as a case study. We assessed both the reproducibility between technical replicates following normalization and the effect of normalization on association analysis. Results indicate that the raw data are already highly reproducible, and that some normalization approaches can slightly improve reproducibility, whereas others may introduce more variability into the data. Results also suggest that differences in association analysis after applying different normalizations are not large when the signal is strong, but when the signal is more modest, different normalizations can yield very different numbers of findings that meet a weaker statistical significance threshold. Overall, our work provides a useful, objective assessment of the effectiveness of key normalization methods.

    Three Molecular Subtypes of Gastric Adenocarcinoma Have Distinct Histochemical Features Reflecting Epstein-Barr Virus Infection Status and Neuroendocrine Differentiation

    Current histopathologic classification schemes for gastric adenocarcinoma have limited clinical utility and are difficult to apply due to tumor heterogeneity. Elucidation of molecular subtypes of gastric cancer may contribute to our understanding of gastric cancer biology and to the development of new molecular markers that may lead to improved diagnosis, therapy, or prognosis. We previously demonstrated that Epstein-Barr virus-infected gastric cancers have a distinct human gene expression profile compared to uninfected cancers. We now examine the histopathologic features characterizing infected (n=14) and uninfected (n=89) cancers, the latter of which are now further divided into two major molecular subtypes based on expression patterns of 93 RNAs. One uninfected gastric cancer subtype was distinguished by upregulation of three genes with neuroendocrine function (CHGA, GAST, and REG4 encoding chromogranin, gastrin and the secreted peptide REG4 involved in epithelial cell regeneration), implicating hormonal factors in the pathogenesis of a major class of gastric adenocarcinomas. Evidence of neuroendocrine differentiation (molecular, immunohistochemical, or morphologic) was mutually exclusive of EBV infection. EBV-infected tumors tended to have solid-type morphology with lymphoid stroma. This study reveals novel molecular subtypes of gastric cancer and their associated morphologies that demonstrate divergent neuroendocrine features.

    Racial Variation in Breast Tumor Promoter Methylation in the Carolina Breast Cancer Study

    African American (AA) women are diagnosed with more advanced breast cancers and have worse survival than white women, but the basis for this disparity is not comprehensively understood. Analysis of DNA methylation, an epigenetic mechanism that can regulate gene expression, could help to explain racial differences in breast tumor clinical biology and outcomes.

    IsoDOT Detects Differential RNA-isoform Expression/Usage with respect to a Categorical or Continuous Covariate with High Sensitivity and Specificity

    We have developed a statistical method named IsoDOT to assess differential isoform expression (DIE) and differential isoform usage (DIU) using RNA-seq data. Here isoform usage refers to relative isoform expression given the total expression of the corresponding gene. IsoDOT performs two tasks that cannot be accomplished by existing methods: to test DIE/DIU with respect to a continuous covariate, and to test DIE/DIU for one case versus one control. The latter situation is not uncommon in practice, e.g., comparing the paternal and maternal alleles of one individual or comparing tumor and normal samples of one cancer patient. Simulation studies demonstrate the high sensitivity and specificity of IsoDOT. We apply IsoDOT to study the effects of haloperidol treatment on the mouse transcriptome and identify a group of genes whose isoform usages respond to haloperidol treatment.
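
The definition of isoform usage given in this abstract (each isoform's expression relative to the total expression of its gene) can be illustrated with a short sketch. The function name `isoform_usage` and the isoform labels are hypothetical, and this shows only the definition itself, not IsoDOT's estimation or testing procedure.

```python
def isoform_usage(isoform_expression):
    """Return each isoform's share of its gene's total expression.

    `isoform_expression` maps isoform names to non-negative expression
    values for a single gene (hypothetical input format for illustration).
    """
    total = sum(isoform_expression.values())
    if total <= 0:
        raise ValueError("gene has no expression; usage is undefined")
    # Usage = isoform expression divided by the gene's total expression.
    return {name: value / total for name, value in isoform_expression.items()}


# Example: one gene with two isoforms.
usage = isoform_usage({"iso1": 30.0, "iso2": 10.0})
```

Two samples can have identical isoform usage while differing in isoform expression (for example, when every isoform doubles), which is why the abstract treats DIE and DIU as separate questions.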

    Mobile health applications: awareness, attitudes, and practices among medical students in Malaysia

    Background: The popularity of mobile health (mHealth) applications (or apps) in the field of health and medical education is rapidly increasing, especially since the COVID-19 pandemic. We aimed to assess awareness, attitudes, practices, and factors associated with mHealth app usage among medical students. Methods: We conducted a cross-sectional study involving medical students at a government university in Sarawak, Malaysia, from February to April 2021. Validated questionnaires were administered to all consenting students. These questionnaires included questions on basic demographic information as well as awareness of, attitude toward, and practices with mHealth apps concerned with medical education, health and fitness, and COVID-19 management. Results: Respondents had favorable attitudes toward mHealth apps (medical education [61.8%], health and fitness [76.3%], and COVID-19 management [82.7%]). Respondents’ mean attitude scores were four out of five for all three app categories. However, respondents used COVID-19 management apps more frequently (73.5%) than those for medical education (35.7%) and fitness (39.0%). Usage of all three app categories was significantly associated with the respondent’s awareness and attitude. Respondents in the top 20% in terms of household income and study duration were more likely to use medical education apps. The number of respondents who used COVID-19 apps was higher in the top 20% household income group than in the other income groups. The most common barrier to the use of apps was uncertainty regarding the most suitable apps to choose. Conclusion: Our study highlighted a discrepancy between awareness of and positive attitudes toward mHealth apps and the use of these apps. Recognition of barriers to using mHealth apps by relevant authorities may be necessary to increase the usage of these apps.

    DiffSplice: The Genome-Wide Detection of Differential Splicing Events with RNA-Seq

    The RNA transcriptome varies in response to cellular differentiation as well as environmental factors, and can be characterized by the diversity and abundance of transcript isoforms. Differential transcription analysis, the detection of differences between the transcriptomes of different cells, may improve understanding of cell differentiation and development and enable the identification of biomarkers that classify disease types. The availability of high-throughput short-read RNA sequencing technologies provides in-depth sampling of the transcriptome, making it possible to accurately detect the differences between transcriptomes. In this article, we present a new method for the detection and visualization of differential transcription. Our approach does not depend on transcript or gene annotations. It also circumvents the need for full transcript inference and quantification, which is a challenging problem because of short read lengths, as well as various sampling biases. Instead, our method takes a divide-and-conquer approach to localize the difference between transcriptomes in the form of alternative splicing modules (ASMs), where transcript isoforms diverge. Our approach starts with the identification of ASMs from the splice graph, constructed directly from the exons and introns predicted from RNA-seq read alignments. The abundance of alternative splicing isoforms residing in each ASM is estimated for each sample and is compared across sample groups. A non-parametric statistical test is applied to each ASM to detect significant differential transcription with a controlled false discovery rate. The sensitivity and specificity of the method have been assessed using simulated data sets and compared with other state-of-the-art approaches. Experimental validation using qRT-PCR confirmed a selected set of genes that are differentially expressed in a lung differentiation study and a breast cancer data set, demonstrating the utility of the approach applied on experimental biological data sets. The DiffSplice software is available at http://www.netlab.uky.edu/p/bioinfo/DiffSplice
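
The abstract above states that per-ASM tests are reported with a controlled false discovery rate, but does not say which procedure is used. As a sketch of what FDR control over many per-module p-values can look like, the following shows the Benjamini-Hochberg step-up procedure, a common choice that is swapped in here as an assumption and may differ from what DiffSplice does. The function name `benjamini_hochberg` is hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return sorted indices of hypotheses rejected at FDR level `alpha`.

    Illustrative Benjamini-Hochberg step-up procedure; assumed here, not
    taken from the DiffSplice paper.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank
    # Reject the `cutoff` smallest p-values.
    return sorted(order[:cutoff])
```

In this setting each p-value would correspond to one tested module, and the returned indices identify the modules called significant at the chosen FDR level.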